SwiftTuna: Incrementally Exploring Large-scale Multidimensional Data

Authors

  • Jaemin Jo
  • Wonjae Kim
  • Seunghoon Yoo
  • Bohyoung Kim
  • Jinwook Seo
Abstract

Advances in distributed computing technologies open up new possibilities for data exploration, even for datasets with a few billion entries. In this paper, we present SwiftTuna, an interactive system that brings modern cluster computing technologies (i.e., in-memory computing) to InfoVis, allowing rapid and incremental exploration of large-scale multidimensional data without building precomputed data structures (e.g., data cubes). Our performance evaluation demonstrates that SwiftTuna enables data exploration of a real-world dataset with four billion records while keeping the latency between incremental responses within a few seconds.

Similar resources

An Efficient Encoding Scheme to Handle the Address Space Overflow for Large Multidimensional Arrays

We present a new implementation scheme for multidimensional arrays that handles large-scale, high-dimensional datasets that grow incrementally. The scheme implements a dynamic multidimensional extendible array employing a set of two-dimensional extendible arrays. Multidimensional arrays provide many advantages, but they have some problems as well. The traditional multidimensional array is not dyn...

Mining Clickstream-Based Data Cubes

Clickstream analysis can reveal usage patterns on a company’s web sites, giving a much improved understanding of customer behaviour. This can be used to improve customer satisfaction with the website and the company in general, yielding a great business advantage. Such information has to be extracted from very large collections of clickstreams in web sites. This is challenging data mining, both in...

Exploring Scientific Discovery with Large-Scale Parallel Scripting

Scientists and the organizations that fund scientific research frequently face difficult questions about how to allocate scarce resources. Should they pursue safe avenues of investigation that incrementally extend current knowledge? Or should they pursue ideas that are far off the beaten track, which are less likely to bear fruit, but more likely to provide revolutionary insights? One group at ...

Exploring Industrial Data Repositories: Where Software Development Approaches Meet

Lots of data are gathered during the lifetime of a product or project in different data repositories that may be part of a measurement program or not. Analyzing this data is useful in exploring relations, verifying hypotheses or theories, and in evaluating and improving companies’ data collection systems. The paper presents a method for exploring industrial data repositories in empirical resear...

Path Planning with Incremental Roadmap Update for Large Environments

Recent research results suggest that one can incorporate motion-planning techniques into the control loop of 3D navigation or tele-operation for more efficient navigation. However, the motion planner with this approach may not scale up well for large workspaces. In this paper, we propose a novel approach to overcome this scalability problem. We limit the region of interest for path-finding to a...

Publication date: 2016